Search Results

Documents authored by Koß, Tore


Document
k-Universality of Regular Languages

Authors: Duncan Adamson, Pamela Fleischmann, Annika Huch, Tore Koß, Florin Manea, and Dirk Nowotka

Published in: LIPIcs, Volume 283, 34th International Symposium on Algorithms and Computation (ISAAC 2023)


Abstract
A subsequence of a word w is a word u such that u = w[i₁] w[i₂] … w[i_k], for some set of indices 1 ≤ i₁ < i₂ < … < i_k ≤ |w|. A word w is k-subsequence universal over an alphabet Σ if every word in Σ^k appears in w as a subsequence. In this paper, we study the intersection between the set of k-subsequence universal words over some alphabet Σ and regular languages over Σ. We call a regular language L k-∃-subsequence universal if there exists a k-subsequence universal word in L, and k-∀-subsequence universal if every word of L is k-subsequence universal. We give algorithms solving the problems of deciding if a given regular language, represented by a finite automaton recognising it, is k-∃-subsequence universal and, respectively, if it is k-∀-subsequence universal, for a given k. The algorithms are FPT w.r.t. the size of the input alphabet, and their run-time does not depend on k; they run in polynomial time in the number n of states of the input automaton when the size of the input alphabet is O(log n). Moreover, we show that the problem of deciding if a given regular language is k-∃-subsequence universal is NP-complete, when the language is over a large alphabet. Further, we provide algorithms for counting the number of k-subsequence universal words (paths) accepted by a given deterministic (respectively, nondeterministic) finite automaton, and ranking an input word (path) within the set of k-subsequence universal words accepted by a given finite automaton.

Cite as

Duncan Adamson, Pamela Fleischmann, Annika Huch, Tore Koß, Florin Manea, and Dirk Nowotka. k-Universality of Regular Languages. In 34th International Symposium on Algorithms and Computation (ISAAC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 283, pp. 4:1-4:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{adamson_et_al:LIPIcs.ISAAC.2023.4,
  author =	{Adamson, Duncan and Fleischmann, Pamela and Huch, Annika and Ko{\ss}, Tore and Manea, Florin and Nowotka, Dirk},
  title =	{{k-Universality of Regular Languages}},
  booktitle =	{34th International Symposium on Algorithms and Computation (ISAAC 2023)},
  pages =	{4:1--4:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-289-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{283},
  editor =	{Iwata, Satoru and Kakimura, Naonori},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2023.4},
  URN =		{urn:nbn:de:0030-drops-193064},
  doi =		{10.4230/LIPIcs.ISAAC.2023.4},
  annote =	{Keywords: String Algorithms, Regular Languages, Finite Automata, Subsequences}
}
Document
The Edit Distance to k-Subsequence Universality

Authors: Joel D. Day, Pamela Fleischmann, Maria Kosche, Tore Koß, Florin Manea, and Stefan Siemer

Published in: LIPIcs, Volume 187, 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)


Abstract
A word u is a subsequence of another word w if u can be obtained from w by deleting some of its letters. In the early 1970s, Imre Simon defined the relation ∼_k (called now Simon-Congruence) as follows: two words having exactly the same set of subsequences of length at most k are ∼_k-congruent. This relation was central in defining and analysing piecewise testable languages, but has found many applications in areas such as algorithmic learning theory, databases theory, or computational linguistics. Recently, it was shown that testing whether two words are ∼_k-congruent can be done in optimal linear time. Thus, it is a natural next step to ask, for two words w and u which are not ∼_k-equivalent, what is the minimal number of edit operations that we need to perform on w in order to obtain a word which is ∼_k-equivalent to u. In this paper, we consider this problem in a setting which seems interesting: when u is a k-subsequence universal word. A word u with alph(u) = Σ is called k-subsequence universal if the set of subsequences of length k of u contains all possible words of length k over Σ. As such, our results are a series of efficient algorithms computing the edit distance from w to the language of k-subsequence universal words.

Cite as

Joel D. Day, Pamela Fleischmann, Maria Kosche, Tore Koß, Florin Manea, and Stefan Siemer. The Edit Distance to k-Subsequence Universality. In 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 187, pp. 25:1-25:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{day_et_al:LIPIcs.STACS.2021.25,
  author =	{Day, Joel D. and Fleischmann, Pamela and Kosche, Maria and Ko{\ss}, Tore and Manea, Florin and Siemer, Stefan},
  title =	{{The Edit Distance to k-Subsequence Universality}},
  booktitle =	{38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)},
  pages =	{25:1--25:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-180-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{187},
  editor =	{Bl\"{a}ser, Markus and Monmege, Benjamin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2021.25},
  URN =		{urn:nbn:de:0030-drops-136705},
  doi =		{10.4230/LIPIcs.STACS.2021.25},
  annote =	{Keywords: Subsequence, Scattered factor, Subword, Universality, k-subsequence universality, Edit distance, Efficient algorithms}
}
Document
Efficiently Testing Simon’s Congruence

Authors: Paweł Gawrychowski, Maria Kosche, Tore Koß, Florin Manea, and Stefan Siemer

Published in: LIPIcs, Volume 187, 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)


Abstract
Simon’s congruence ∼_k is a relation on words defined by Imre Simon in the 1970s and intensely studied since then. This congruence was initially used in connection to piecewise testable languages, but also found many applications in, e.g., learning theory, databases theory, or linguistics. The ∼_k-relation is defined as follows: two words are ∼_k-congruent if they have the same set of subsequences of length at most k. A long standing open problem, stated already by Simon in his initial works on this topic, was to design an algorithm which computes, given two words s and t, the largest k for which s∼_k t. We propose the first algorithm solving this problem in linear time O(|s|+|t|) when the input words are over the integer alphabet {1,…,|s|+|t|} (or other alphabets which can be sorted in linear time). Our approach can be extended to an optimal algorithm in the case of general alphabets as well. To achieve these results, we introduce a novel data-structure, called Simon-Tree, which allows us to construct a natural representation of the equivalence classes induced by ∼_k on the set of suffixes of a word, for all k ≥ 1. We show that such a tree can be constructed for an input word in linear time. Then, when working with two words s and t, we compute their respective Simon-Trees and efficiently build a correspondence between the nodes of these trees. This correspondence, which can also be constructed in linear time O(|s|+|t|), allows us to retrieve the largest k for which s∼_k t.

Cite as

Paweł Gawrychowski, Maria Kosche, Tore Koß, Florin Manea, and Stefan Siemer. Efficiently Testing Simon’s Congruence. In 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 187, pp. 34:1-34:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{gawrychowski_et_al:LIPIcs.STACS.2021.34,
  author =	{Gawrychowski, Pawe{\l} and Kosche, Maria and Ko{\ss}, Tore and Manea, Florin and Siemer, Stefan},
  title =	{{Efficiently Testing Simon’s Congruence}},
  booktitle =	{38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)},
  pages =	{34:1--34:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-180-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{187},
  editor =	{Bl\"{a}ser, Markus and Monmege, Benjamin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2021.34},
  URN =		{urn:nbn:de:0030-drops-136796},
  doi =		{10.4230/LIPIcs.STACS.2021.34},
  annote =	{Keywords: Simon’s congruence, Subsequence, Scattered factor, Efficient algorithms}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail